Connected digit recognition using short and long duration models
Abstract
In this paper we show that accurate HMMs for connected word recognition can be obtained without context-dependent modeling or discriminative training. We train two HMMs for each word. Both have the same standard left-to-right topology with the possibility of skipping one state, but each has a different, automatically selected number of states. The two models account for the different speaking rates that occur not only across utterances of different speakers, but also within a single connected word utterance of the same speaker. This simple modeling technique has been applied to connected digit recognition using the adult speaker portion of the TI/NIST corpus, giving the best results reported so far for this database. It has also been tested on telephone speech using long sequences of Italian digits (credit card numbers), giving better results than classical models with a larger number of densities.
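The topology described in the abstract can be illustrated with a minimal sketch. The state counts (4 and 8), the transition weights, and the function names below are assumptions chosen for illustration only; the paper selects the number of states automatically and does not give these values here. The sketch also omits exit probabilities and emission densities.

```python
def left_to_right_transitions(n_states, p_stay=0.5, p_next=0.4, p_skip=0.1):
    """Build a transition matrix for a left-to-right HMM in which each state
    may loop on itself, move to the next state, or skip one state.
    The weights are illustrative assumptions, renormalized per row."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        weights = {i: p_stay}
        if i + 1 < n_states:
            weights[i + 1] = p_next
        if i + 2 < n_states:
            weights[i + 2] = p_skip
        total = sum(weights.values())
        for j, w in weights.items():
            matrix[i][j] = w / total
    return matrix


def min_frames(n_states):
    """Fewest frames needed to go from the first to the last state
    when one state may be skipped at each step."""
    return 1 + n_states // 2


# Hypothetical state counts for the two models of a single word.
short_model = left_to_right_transitions(4)
long_model = left_to_right_transitions(8)

print(min_frames(4), min_frames(8))
```

Under these assumptions the short model can be traversed in fewer frames than the long one, which is one way to see why a pair of models per word can cover both fast and slow pronunciations.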
Related papers
Using Duration Information in Cantonese Connected-Digit Recognition
This paper presents an investigation on the use of explicit statistical duration models for Cantonese connected-digit recognition. Cantonese is a major Chinese dialect. The phonetic compositions of Cantonese digits are generally very simple. Some of them contain only a single vowel or nasal segment. This makes it difficult to attain high accuracy in the automatic recognition of Cantonese digit ...
Explicit duration modeling for Cantonese connected-digit recognition
This paper describes a study on using explicit duration models in hidden Markov model (HMM) based Cantonese connected-digit recognition. An HMM does not give explicit control over the temporal structure of speech. As a result, the recognition output may exhibit unreasonable duration patterns, which are often accompanied by recognition errors. We propose to use a duration model that...
Duration Modeling in Mandarin Connected Digit Recognition
Digit string recognition is required in many applications that need to recognize numbers such as telephone numbers, credit card numbers, dates, etc. In order to design a high-performance recognizer, duration information is explored in this study. In a Mandarin connected digit recognizer, insertion and deletion errors amount to more than two thirds of the total recognition errors because there e...
Connected Digit Recognition with Class Specific Word Models
This work focuses on efficient use of the training material by selecting the optimal set of model topologies. We do this by training multiple word models of each word class, based on a subclassification according to a priori knowledge of the training material. We will examine classification criteria with respect to duration of the word, gender of the speaker, position of the word in the utteran...
Context-dependent word duration modelling for robust speech recognition
Conventional hidden Markov models (HMMs) have weak duration constraints. This may cause the decoder to produce word matches with unrealistic durations in noisy situations. This paper describes techniques for modelling context-dependent word duration cues and incorporating them directly in a multi-stack decoding algorithm. The proposed model is capable of penalising duration constraints of a wor...